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Abstract 



In quantum computation we are given a finite set of gates and we have 
to perform a desired operation as a product of them. The corresponding 
computational problem is approximating an arbitrary unitary as a prod- 
rr** uct in a topological generating set of SU(d). The problem is known to 

be solvable in time polylog(l/e) with product length polylog(l/e), where 
the implicit constants depend on the given generators. The existing algo- 
, rithms solve the problem but they need a very slow and space consuming 

. preparatory stage. This stage runs in time exponential in d 2 and requires 

memory of size exponential in d 2 . In this paper we present methods which 
make the implementation of the existing algorithms easier. We present 
heuristic methods which make a time-length trade-off in the preparatory 
step. We decrease the running time and the used memory to polynomial 
in d but the length of the products approximating the desired operations 
will increase (by a factor which depends on d). We also present a simple 
method which can be used for decomposing a unitary into a product of 
group commutators for 2 < d < 256, which is an important part of the 
existing algorithm. 

1 Introduction 

The Solovay-Kitaev theorem ([5], Section 8) asserts that if a set G in SU(d) 
(with some simple properties) generates a dense subgroup in SU(d) then this set 
fills up SU (d) quickly. It means that we can approximate an arbitrary unitary 
U G SU(d) with a short product of operations from G. Let G = {ffi, <?2, ■ ■ -3n} 
be a generating set for SU(d). We would like to approximate an arbitrary 
U G SU(d) with arbitrary precision e, we need a sequence of gi u ,g2 u , ■ ■ -gmu 
(gi v G G) which satisfy the inequality \\U — Y\™ =1 gi u \\ — e - ^ n M we can nn d 
an algorithm which produce this product by constructing recursive coverings of 
the neighborhoods of the identity. In pQ we can find a nice and transparent 
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version of the Solovay-Kitaev algorithm. These methods require a preparatory 
stage (which must be performed only once for a fixed generating set G) which 
generates a set T of products from G up to a fixed length, which depends on 
d. This step needs huge computational efforts for higher d. The set T gives an 
initial eo-covering of SU (d) but the size of T is exponential in d 2 and hence the 
time of generating T is exponential in d 2 too. It needs a huge computational 
effort for d > 2, that is in the case of more than a single qubit. 
In £Q the initial covering needs eo < l/(8y/d(d— l)/2). In |3| the algorithm uses 
a universal eo which is a power of 1/20. 

In this paper we suggest a data structure and heuristic methods for producing 
r in polynomial time and space in d. The payoff is that the length of a product 
approximating a given operation is increased by a factor which depends on 
d. Combining this version of the preparatory stage with the second stage of 
the algorithms mentioned above will give an approximation for an arbitrary 
unitary in SU(d) as a product from G where the length of the product is still 
polylogarithmic in 1 /e but the constant factor will be increased. The method is 
scalable in the sense that the speed of the preparatory step can be chosen and 
so can the factor which increases the length of the approximating product. 

The main idea is adopting the methods "Shrinking", "Telescoping" and 
"Zooming in" which are described in section 8 in |3J. In Section |3 we define 
the basic notions which we will use in this paper. In Section [3] we describe our 
methods and we present some computational results. In Section 21 we extend 
the algorithm described in pQ by giving an alternative method to decompose 
a unitary. It is useful because the method presented in pQ induces matrix di- 
agonalization which appears to be difficult to implement in some systems for 
symbolic computation (for instance GAP [2|). Increasing the quality of the ini- 
tial covering we can make the decomposition avoiding matrix diagonalization. 

2 The main tools 

We review the main ideas of the Solovay Kitaev algorithm. First, we specify 
the requirements regarding the generating set G. 

Convention 2.1 We say that G is a generating set for SU(d) if the following 
hold: 

1. G C SU(d) and G generates a dense subgroup in SU(d). 

2. If g € G then g^ 1 = E G. 

The first requirement is natural as we have to approximate an arbitrary 
element of SU(d) by an element from the semigroup generated by G. (Note 
that a closed subsemigroup of SU(d) is a subgroup.) The second requirement 
is technical. However, it is not known what we can say about the length of a 
product in an approximation if G does not satisfy the second condition. However, 
in quantum computation the second condition is usually not really restrictive. 
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For measuring the distance in SU (d) we use the operator norm. 
IfU,V€ SU(d) thend(U,V) = \\U-V\\ where \\U\\ = sup H=1 \Ux\. It satisfies 
the following properties: 

1. d(U, V)>0 and d(U, V) = iff U = V, 

2. d(U,V) = d(V,U), 

3. d(U, V) < d(U, W) + d(W, V), 

4. d(UW, VW) = d{WU, WV) < d(U, V) (assuming W is unitary), 

5. if d(Ui,U-) < 5i (i = then d(JlLi U ^ ULi U 'i) ^ E|=x **» as " 
suming that Ui, U[ are unitary. 

We recall the following lemma from P^. 

Lemma 2.2 1. IfA,B,A,B are unitaries such that 
d(A,v4'),d(B,B') < 5i, d(I,A),d(I,B) < 6 2 then 

d([A,B],[A',B']) < $5i52 + 45i8% + 88l + 45f + Sf. 

2. If A, B are Hermitian matrices and \ \A\\, \ \B\\ < S then 

d(exp(i(A + B)), exp(iA) exp(iB)) < S 2 . 

3. If A, B are Hermitians such that \ \A\\, \ \B\ \ < S then 

d(exp(«A) exp(iB) exp(-iA) exp(-iB), exp([iv4, iB])) < AS 3 . 

4- If A is Hermitian then d(exp(iA), I) < \\A\\. 

The proof of the first statement can be found in P^. The proof of the other 
statements can be found in j3| and 

We adopt the notion of nets from 0. 

Definition 2.3 • Let T,H C SU(d). Then T is a S-net for H if for all 

h G H there exist 7 € T that d{^y, h) < 5. 

• An (r,6)-net in SU(d) is a 5-net for the r -neighborhood of the identity 
which denoted by S r . 

• Let r be a 5-net for H. Then T is a-sparse if for all 7 € T there exist 
h £ H such that d{h 1 7) < 5 and for all 71,72 6 T we have d(p(\, 72) > aS. 

• For an (r,S)-net the ratio q = r/5 is called the quality of the net. 
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As a part of the Solovay-Kitaev theorem in [3] it was proved that any e-net 
in a compact semisimple Lie group generates a dense subgroup if e is small 
enough. The proof is constructive in the sense that there is an algorithm to 
generate an e'-net from an e-net for an arbitrary e' < e. The algorithm pre- 
sented in pP works in a reverse order: it recursively decomposes a unitary into 
a group commutator in which the components are close to the identity and it 
approximates the commuting elements using an e-net. 

We will make use of some basic properties of the nets (shown in 

Lemma 2.4 1. (Telescoping) 

Let Ti be an (r 1; 5i)-net, T 2 be an (r 2 , <5 2 )-ne£ ; where b\ < r 2 . 
Then TiT 2 = {U1U2 •' U\ G 1^, U 2 G T 2 } is an (7*1, S 2 )-net. 

2. (Zooming in) 

Let ro,ri,...r„ C SU(d) be nets, where Ti is an (r.i, Si)-net and Si < 
r.; + i. If V E S r „ then V can be approximated by U = UqU\ ■ ■ ■ U n where 
Ui E Ti and d(U,V) < S n . 

The main problem is that an (r, r/q)-net has at least q 0< - d2 ^ points. It follows 
from the fact that in SU(d) the volume of a sphere with diameter S is 0(6 d 
But q°( d ) points are enough for an (r,r/q)-net, it follows from the fact that 
one can make an (r, r/g)-net in C d with q°( d ) points. For the purposes of the 
second stage an initial covering of SU(d) with quality q > 20 is required. (Note 
that he diameter of SU(d) is 2 in the operator norm, i.e. d(x, y) < 2 for all 
x,yeSU(d).) 

3 The preparatory stage 

To illustrate the use of zooming in, assume that T is an e-net for the entire 
SU(d). Thus T has l/e°( rf2 ) elements. 

We would like to show that the number of matrices in T can be decreased, 
but the length of products which produce the elements of T will increase. 

Wc define a sequence of nets r, (i = 0, 1 . . . [(1 + log(l/e))/ log(g)] , where 
1< q < 1/e) by 

^ = {7 e r I 2/g* < d( % I) < 2/q* +1 } u {/}. 

Each Ti will be an 2/g l+1 -net for S 2 / q i \ S^/qi+y, so each Ti is an (7^, rj/g)-net 
where r$ — 2, and r^ + i = ri/q. Then sparsening these sets we obtain (ri,ri/q)- 
nets where each has q°( d - 1 elements. 

Using Zooming in (defined in Lemma I2.4fl we get an e-net as a chain of 
(n,ri/q)-nets Tj (i = .. .k = (1 + log(l/e))/log(g r )) with "poor" quality q. In 
this case the length of a product in Ti is not more than the maximal length of a 
product in the e-net. So the length of a product using the Telescoping structure 
instead of the e-net T will be increased by a factor k = (l+log(l/e))/ log(q). The 
cardinality of \JTi = k ■ q°^ d >. With an appropriate choice of q (i.e. q = S/2) 
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we can ensure that | |J Tj| is polynomial in d, however in this case the length of 
the products will increase. 

Computational experience shows that it is difficult to construct nets with 
"very poor" quality, i.e. q < \/2 where m >> d 2 . 

The most important problem is constructing such a sequence of nets quickly. 
The only known accurate method for constructing a base e-net is to compute 
and store all products from G up to a fixed length. The problem is difficult 
in the sense that we do not know anything about G and the problem strongly 
depends on the properties of G. In general, without any assumption about G it is 
the only known method. Consider the case when d(g, I) < 5 for all g £ G, then 
the distance between the identity and an n-length product will be at most nS. 
But computing all products and then sparsening it leads to the same problem: 
storing a huge amount of matrices. 

We propose a heuristic method which speeds up the construction of a se- 
quence of (ri,ri/q) -nets by increasing the length of the products in approxima- 
tions (by an additional factor). We construct the nets in parallel. 

The main heuristic algorithm 

Let G, q and e be fixed, where l<g<l/eas above (i.e. q rs d \/2) and we 
assume that we have an initial (2,2/g)-net Tq. 

Let k = |~(1 + log(l/e))/log(g)] , and let Ti, r 2 , . . . ,Tk be empty sets. At the 
end of the procedure each I\ will be hopefully an {2/q l , 2/g l+1 )-net. At first for 
all g £ G let L, = T l U {g} iff 2/q l > d(g, I) > 2/q l+1 (i = . . . k). 

In each step we increase the cardinality of the set (Ji=o ■ ^ we cann °t 
increase this cardinality then the algorithm terminates. The set Ti remains a 
subset of an (2/V, 2/V +1 )-net after each step, and we assume that r^s are sparse 
(Definition 12. so each Ti has at most q°^ d ^ elements after each step. 

k 

In each step of the algorithm we calculate the products H = G • 1J T^. It is 
easy to check that 

|ff|<|G|.^|r^ n\G\-k-q°V 2 \ 

k 

Let h € H (where h = gj, g e G, 7 <E 1J Ti) and let S = d(I, h). We check 

i=Q 

the following properties for all elements of H : if there is an index i with 2/q l > 
S > 2/q l+l then check if there is an element 7, £ Ti with (£(7,, h) < 2/q 1+1 . If 
there is no such 7^ then = U {h} and we continue the algorithm with an 
another element of H. Otherwise we divide h by the element 7^, in this case 
d(H rl »*) ^ 2 /V +1 so fryr 1 is a candidate for membership in Tj for some j > i. 
We check the same property for /i7 4 _1 , and so on. If d(/i7 l ~ 1 ,/) < e then we do 
not use this element, elsewhere we increase the cardinality of (J Ti by adding 
h"f~ l . We continue the method until we cannot get new elements. 
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This method increases the length of the products (because of dividing), but 
we can control this length by choosing a maximal length L and we will not use 
a product which has length more then L, where L can be chosen c • log(l/e) for 
some constant c. 

Our computational experience shows that in many cases once the algorithm 
terminating, each of the I\ is a (2/q l , 2/q t+1 )-nct. Of course, the correctness and 
the performance of the method strongly depends on the properties of G. The 
assumption that we have an initial (2, 2/g)-net T is technical. If q is close to 1 
(q w V2) , then the cardinality of To does not depend on d and we can construct 
it quickly. The interpretation of this assumption is that we need some elements 
which are far from the identity (the method fills up the nets downward). 

As in each step we get a new element (except when the element is closer to 
the identity then e or the length is more then L) the running time of the method 
is 0(J2 = 0((1 +log(l/e))/log(<?) • |r 4 |. Using the Telescoping method we 
get an e-net where the length of the products is less then L- (l+log(l/e))/ log(g). 

The algorithm will fail when G has some "bad" properties. For example 
consider the following case: Let G/ be a finite matrix group such that for each 
g € G there exist g' £ G/ such d(g,g') < e. A product of length n from G is at 
most ne far from G'. 

A special case is when for each g e G the distance d(g,I) < e. Let U be 
an arbitrary unitary from the group generated by G. In this case d(U, Ug) = 
d(I, g) < e so we can not get new elements with the above algorithm for arbitrary 
I> 

We tested the algorithm for the usual generating set of SU (d) consisting of 
the Hadamard-gate, K-gate, 7r/8-gate and the CNOT-gate. The quality q was 
selected from \/2 to V2~ and d was selected from {2,4,8}. We obtained that 
we could produce an e-net with this method quickly (mainly because the size of 
the data structure representing the net is not exponential in d 2 ). With q = \/2 

the size of will be <ff (d2 \ k = d ■ log(l/e) and L = 0(d 2 ■ log(l/e)). Using 
the method of Telescoping the length of a product will be at most kL. With 
the choice q = d \[2 we obtained that | — c where c depends on G. In our test 
cases 5 < c < 40. 

A complementary method for further possible speedup 
Let L be an (r, r/g)-net where q > 4. This method produces a set of elements F 
and computational results show that T is often an (r/q, r/q 2 )-net. The length 
of a product in T is three times the length of the products in T. 

The method is the following, let H = F n S r/2 = {U € T : d{U, I) < r/2}. If 
q > 4 then H is not the empty set, and so H is an (r/2, (r/2)/(g r /2))-net. We 
have HH C S r since 

d(UV,I) = \\UV - I\\ = \\(U - I)V + (V - I)\\ < 

+ =d(U,I) + d(V,I) <r/2 + r/2 = r 

for U,VeH. 
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As T is an (r, r/g)-net, if UV £ HH then there exist W £ T such d(UV, W) < 
r/q and in this case d(E/VW _1 , 1) < r/q. 
Let r' = {UVW^ 1 :U,V £ H,W £T, d(UV, W) < r/q}. 

Then |r'| < \H\ ■ \H\ = (q/2)° id ^ ■ (q/2)° (d2 \ The length of a product in T is 
at most three times as large as the maximal length of a product in T. 

Obviously, T will not be a net in all cases. For instance, let T be a finite 
subgroup of SU(d) which is also a (2, 2/g)-net, but it is not a net with quality 
q' > q. The method will not work for this T. Computational results show that 
for the usual generating sets and with q > 4 the set T will be an (r/q, r/q 2 )-net 
(however, often it is much more dense than desirable). 

For testing we used the following gates as generating set G: 



Hadamard gate 



K-gate 



7r/8-gate 



CNOT-gate 
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4 Decomposing a unitary 

In this section we extend the algorithm described in £Q. We give a brief outline 
of the algorithm. The method is recursive, for a unitary U it gives an e„ approx- 
imation U n in the n-th iteration step. It decomposes the quotient A = UU~ X 
into a group commutator A = [V, W] where V and W are close to the identity 
and it performs the algorithm on V and W as well. This algorithm needs an 
eo covering for SU (d) and in each iteration step it gives an e„ approximation 
for U where e n — > as n — > oo; more precisely e„ = Ca PV rox^J-i for a constant 
Capprox- Hence 6 n ► if cq <C ]~/c a pp r0 x~ 

For a complete description of this algorithm the reader is referred to [J. A 
main step of this method is to decompose a unitary into a group commutator. 
There is an efficient method for unitaries in SU(2) and it has been proved that 
the decomposition can be made for d > 2. But for d > 2 the method described 
in £Q needs the diagonalization of a unitary in order to obtain a decomposition. 

We give a method for decomposing a unitary into a product of group com- 
mutators and we prove that the algorithm still remains correct but the constant 
Capprox w iH be increased and hence we need a better eo covering. In return, the 
decomposition can be made easier. 
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The algorithm in the (n — l)-th iteration step gives an approximation U n -i 
for a unitary U where d(U,U n -x) < e n -i- Let A = UU~\ and d(A,I) < e n -\- 
We decompose A into a product of group commutators [E^ , E^ 2 '][F^' , F^ 2 '] 
such that d(A,[E^,E^}[F^\F^]) < c gCl e 3 J\. The constant c gci will be 
specified later. The unitaries E^ , F^' are close to the identity: 

d{E^\l) < c 3C2 V^7T 
d(F^,I) < c gC2 V^T 

for j = 1,2 and for an another constant c gC2 . We perform the algorithm on 
these elements with n — 1 iteration step and we get E^l^F^ where 

d(E^,E^) < e n -i 
d(FV),F«\) < e n ^. 
The decomposition is based on the following lemma (which can be found in 

ID) 

Lemma 4.1 Let H be a traceless off-diagonal d- dimensional Hermitian matrix. 
Then we can find Hermitian F and G such that: 

[F, G] = iH, 

\\f\\,\\g\\ < d ^(^y 2 ^/m\. 

Proof: Let G be a diagonal matrix with the following entries: 

G jd = -(d-l)/2 + (j-l). In this case | \G\ \ = (d-l)/2. Let F be the following: 



J ' \ o ifj = k 

It is easy to see that [F, G] = iH, \\F\\ 2 < tr(F 2 ) < tr(H 2 ) < d\\H\\ 2 , so 
\\F\\ < Vd\\H\\. Rescaling F and G gives the desired Hermitians. From this 
lemma we can see that c gC2 will be d}^{{d — l)/2) 1 / 2 . 

If H is diagonal then conjugating it with the d-dimensional Fourier-matrix 
(or with a (i-dimensional Hadamard which can be found in the database [o] for 
all d satisfying A\d and d < 256) we get an off-diagonal matrix which can be 
decomposed. In the original algorithm we have to diagonalize the Hermitian H . 
Conjugating preserves the decomposition hence S[A, B]S~ 1 = [SAS^ 1 , SBS -1 ]. 

We make the decomposition in the following way: A is a unitary with trace- 
less Hamiltonian H and we write H = H a + Hd where H a is the off-diagonal 
part of H and Hd is the diagonal part of H. Then ||i? ||, \\Hd\\ < since 
H , Hd are still Hermitians. 

By Lemma KT\iH„ = [e (1) ,e (2) ], iH d = [J (1) ,/ (2) ] where 

l| e 0) IUI/ w ll<rf 1/2 (^) 1/2 ^M- 
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Let EM = exp(ie^) and = exp(ifW), then 



ddE^E^F^F^A)^ 
< d([^ 1 ,^ 1 ][^ 1 ,^ 1 ],[£W,^]tFW,F( 9 )]) + 
+ d{[EW,EW][FV>,F<?>],A). 

By Lemma f2. 21 and the properties of the distance function, 

rf([^ 1 ,^ 1 ][^,e ) 1 ]J^ (1) ^ (2) ][^ (1) ^ (2) ])< 

< d^^iU^^^ < 

< l&e^did-l)^) 1 ' 2 ^^. 

By Lemma f4. II 

d([E^,E^][F^,F^},A) = 
= d([exp( l e( 1 )),exp( l e( 2 ))][exp( J /( 1 )),exp( l /( 2 ))],exp( J ff)) = 

= d([exp(ie (1) ),exp(ie (2) )][exp(i/( 1 )),exp(i/( 2 ))],exp([ie (1) ,ie (2) ] + [if {1 \ if™])) < 

< d(exp([ie (1) ,ie (2) ] + [if™, if {2) ]), exp([ze (1) , ie {2) ]) exp([i/W, if^})) + 

+ d(exp([ie (1) ,ie( 2 )])exp([i/( 1 \i/( 2 )),[exp(ie (1) ),exp(ie (2) )][exp(i/( 1 )),exp(i/( 2 ))]). 

By Lemma O 

d(exp([ze (1) ,ie (2) ] + [i/ (1) ,z/ (2) ]),exp([ie (1) ,je (2) ])exp([i/ (1) ,i/ (2) ])) < d\\H\\ 2 < de 
and 

d(exp([ze (1) ,ie (2) ])exp([i/( 1 \i/^]),[exp(ie (1) ),exp(ze (2) )][exp(z/( 1 )),exp(i/( 2 ))]) < 

<8(d(d-l)/2f/h^ v 

So the constant c gci will be 8(d(d - l)/2) 3 / 2 . 
We get 

<[^ 1 ,^ 1 ][F« 1 ,F^ 1 ],A)< 
(16(d(d - l)/2) 1 /2 + ^ + 8(d(d - l)/2) 3 / 2 )el 2 1 . 

Let c approx = 16(d(d-l)/2) 1 / 2 + d + 8(d(d-l)/2) 3 / 2 and we get the desired 

form, e„ < Copprose^ < £q 3/2) ir e o < ^/cl pprox - The length of a product at 
the n-th iteration step is l n = 8Z n _i. To approximate a unitary with error e 
then we need at least 

77 ^> — _ _ 

log(3/2) ' 

The length of the product will be Z08" = 0(log(l/e)). But with this method 
the initial covering must be better then in £Q. 
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